SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: 

(A) NAME: INSTITUT PASTEUR 

(B) STREET: 28 RUE DU DOCTEUR ROUX 
(C) CITY: PARIS CEDEX 15

(E) COUNTRY: FRANCE

(F) POSTAL CODE (ZIP): 75724

(ii) TITLE OF INVENTION: A METHOD FOR ISOLATING A POLYNUCLEOTIDE OF 
INTEREST FROM THE GENOME OF A MYCOBACTERIUM USING A 
BAC-BASED DNA LIBRARY. APPLICATION TO THE DETECTION OF 
MYCOBACTERIA. 

(iii) NUMBER OF SEQUENCES: 5 

(iv) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.30 (EPO)

(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 12732 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:

ACCTGCGCTT GCAGAGATCA AATAGGGCGC ATGGGTCAGC ATAGTACAGG TCGTCGCGCA 60

TCTTTGATGC ATCGGAATAA GATGTCAGGC AATTAAAAGA GAAGCCACGG CGACTCGCGG 120

CATTCAGCAT GTCGAGCGTC GCTTCGATGT GAGCGCACCA TTCCGTGTCC AACGATTTCA 180

GACGAACATT GAATATTCCA CTCGCGACGC TATAGTCCGC CTCCCGATCT ATGCGCGCCG 240

CGCAGATGAA GTCTGCGTTC GCCCGACCTT CGAAACGTAG TGCGGCCGCG CGCACCATTT 300

CGGGGGAGAC GTCGATGCCG GTGTAATCAG TTTTGAAGCC ACGCGCATCT AGGTAGTCCA 360

GTAGAGCCCC ATAGCCACAG CCTAGATCGT TGATCGAAAA TGGGTCCGCC GCATTGACAA 420

TGCGCACCAG CTGGTCAAAG CGCAACGCCT GCCCGGCTTC GCCGTTCCAA TCGACGCCGC 480

GCGGGTGCCG TGTGCTTCGA GTTTCGATGC GTAGTAACGG GCCACGTCAG CGAGCATGGT 540

CGTTGCGTCT TCCGCCATGA AGCTGCCTCA CGATTTGTGT GTGTGGGCGT CGGTGCGTGG 600

GTCCGAGACT ATACCTTCAA CAGTTGCATG CCGAGGCTGC GGCGGGCAAT GACCCAAAAA 660

CCCGCCGGCA CGGTTCGCCG AGCAAGGAAG CGTGGAGACG ATAGATAATT TCACTGGCGA 720

CAGTACCTCA AATAGTCCGG AGCCTCGGCT CCGACGTTAA AGAGCAGATC CAGAATCGAC 780

ACGGCGGGCT CGAACCCTCC CCACAATTGC TTATAATCGC GGTAGCCGTC ATAATCGAAC 840

CAAGTTACCC GGATGCTAAG TTCGTCGAAC ACGCGCTCAT CGACATACGA ACGGGCTGAG 900

GGGCCAGAGA CATATTCGGT CGCTGCGGCC TGTTGGCAGA GGTTGGCCAG TCTCTCGGTC 960

TTGCCGTCGG CTAATTCGTA GTCCCACGAA TTTGCCAGTC GCGTGCTGAT ACCGAGATAA 1020

CTGCAAATCG CATTCAATAG ACGCCTGTTG AGTAAGGAAA GATTCGTGTG CTGTTCTTCG 1080

AGGTAAATCG GCGCGAGCCA GTCAGCGATC TCCGCAAAAT GAGCGGCCGC GCTGTAGTTG 1140

AATTCTAGTG CCCGCCAGTG CGCTTTCGCC CAATCGGTGC CGTCGATCAG CGTCTCACGT 1200

ATCTTTTGAT GGAAACGTCC CTTCACCTGG ACGGGAACAG TTATCCACTG TAACCCCTGG 1260

CTCGTTTTGA TCCGATTTCT GTTTCGCCAA TCACGCTTGG TATATTGCAT GTCATCATAG 1320

ATGATGAATT CATCGACGAA TGCAATCAGG TCAAAATATC CTCGCCAAGG TATGTAATTT 1380

GATTGAACAA TCGCGACTTT CTTCAACGCG GTGTCTCCAA TTTAGAATAA CAAATACGTC 1440

GCGCCCGCGA CAGCTCCGCT GGAGCGAGTT CAAGCGATTC TGCGACATAT TCAATATGGT 1500

GCTCGGGAAG GCCAGGATGG GCCGCGACCC GGGGCGTCCG GTGCGCGATG AACGTCGCAT 1560

CGTCTCCTGT GAGATAATTG CATCCGATCA TATAGGGCTG GCTGCGGCTA GGTTGCTGGC 1620

AAAAAGATAT CGCGGCCGAT CCGTTTCTGG TTTTGTCTTG ATGATCAAAT CCGCTTCCGT 1680

TCACGAGATC GATTCCTGGT CTTCCCCCAG CGTCGCGATG TCGATAGGTG TCGCGCTTTG 1740

TTCGTACCCG CACTACGCGG CGGCGAGAAC CTCGCCACCG AATCGGGATT GGGGGGAGGA 1800

TACCACTCGG TCGAGGCCCG TCACCGGCCT TCTAGCGGGT TGACCATCAG TGTTTGCAGG 1860

GCCCTATCCC GGTATGGCGC ACCACGGGAT CGGCAGCGTT CCGGTTGCTG GCGTGGTACC 1920

TCGTTGTGGC GCCGTGGTCC ATGTCGATTG AGTGCGTGGA TCAGTGTAAA CCGTTGCGCG 1980

CCATGTTCTG TAGGCACTGG TTCGGGTTGT GGTTAGGCTG CACGGTTGGC AGGTTACCAA 2040

CCACTGAGCC CCTGGGCGGA TGTGAGCTCG GACTCCGCCT ATGGGGTGTA ATTTTGGCAG 2100

ATTGGGCCGG GTCCCCGTGG TGAGGACTCC TCAACCGGAT TGGGTAAGCA TGAGGTGGTG 2160

CTGGCAGCGG TGTCCTGGTC GCTCTCCCGA GTAGGCCCGT TGTGACTGTC ATGTGGGCGA 2220
GCGGGTTTGC GCGCGTAGGA GACGATGATT ACTACGCACG TGACCAACCA CAAGAACGGT 2280

GCCCATGTCA CCGTGGTGAA AACGAGTGGC GTGGTACCGA CTACCCCTTT GGCTCCCAGC 2340

TGTCCATAGA GCGGCACGTA GAACGGCTGG CCCGGGACCG CGACGTTGAC GATGCTCAGC 2400

GCCACGGCCA AACTCACGCA GACGCCGACC GCGCGGCGGC GGTCTCCATG GGCTGCGAGT 2460

TGGTCGAATA TCCCAGCACC AGGAGGCCCG TTGGGGTCTC GGGCTACCAG TGCAGCGATT 2520

GGCAAGACGA AAACGAGATA GTAGAAGGCG ACGTCCGCGG GGGAGAAGGT GGCGGTGGCG 2580

AGCAACACAA TCCCCACCAT GACAGGCGGG ATACGGCGTC CGAGCGCCAG CACGGCGACC 2640

ACGACTATGA CTAGGACAGC AAACCCGATC TGCGTTCGCG GACCAGTGAG GAAACCCTCT 2700

GGGATCTTGC CCGATTGATA GTTCTTGATG CTATCGGGGA TCAGCAGGAG TGCCTTGCCA 2760

AAGGACACGT TCCGCGGGTC TCGAAGCCCT CCGAACGAAC TATTGAACTT GATGATGCCG 2820

TGGATCGACT GTGCGATCGT CCCCGGGAAG CCTCGTGGCC ACAACAGAAA GGCTGCGATA 2880

TTGGACACCA CCACGCCGGT GATCCCGATA CCAGCCCACC GCCATTGTCG AGCCGCCAAC 2940

AACACCACGC CGAGAACGAC GAACTGCGGC TTTACCAGGA CGGCCAAGAT CACCGTGATG 3000

GTGGCGAGGC CCCACCGCTG TCGGGACAAC GCCACGAAGT AAGCCAGCGC GATCGGTACC 3060

ACGAACCCTG TCGAGTTGCC TCGATCGATG ACCCCCCACG CCGGGATGGC CGCGGCGCCC 3120

AGTGTCACGA AGATGACCAC TCGCTCCAGA CCACGTGCCC CCCGGGCCGC CCAGATGGCG 3180

GGAGATATGA CCGCCATCGT TAGGGCGACC AGGTAACAGA TCAGCCCCAA GCGCGGCGCA 3240

CCCAGCCAAT GGCTGGGTAG TCCGAAAATC GCATACGGTA TGCGGGCGGG GGCCCATGCA 3300

GCAACCGCGG TCGGCTGGTA ATCGGCGGGT AGCGAGATCA GGTAGTCCGC GGGATTGGGT 3360

TGAATCCCGG CGGCGGCGAC CATGGCGTAG TCGCTGAAGC AGTGCCGACC GATATTCATG 3420

CCCCAATCAA GCCAACAGTC CCCAGGGACT ACCAAAAGAG TGGAAAAGAC GTCGACCGCG 3480

TACCACTGAC TGAGGGCGTA CGCCGTCGCC GCCGAAATCA CCGACGCCAG CAGGATGGTG 3540

CCGAGCATGA GGGTGCGCTC GGATTGGGAG CCGATCGCCC AGAGCCGCTC CCGGCTCGCG 3600

GTCACGGCAC CGCGCAACAC V-O Li 1 itl IjLiAI I CTCCT CGGTTCTGCG 3660

CGAAACGGTA GCAGAGCGCC ATGGTTGCCA ACGCGGTCGC CGGGCAGTCT AGACCGGATC 3720

TTCCTCGTGG CAACCGACAA CAGGACGTCG TTGCCGAAAG GGCGCTGGGC ACCGACATCT 3780

AGGATGAACC CACAGCCACG CCCCGACGTT ATGCCATGGC GAAGAGCGAC CGGCAGGAGC 3840

GGGAACCCAG TGAAGCGAGC GCTCATCACC GGAATCACAG GACCGGACGG CTCGTATCTC 3900

GCTAAGCTCC CGCTGAAGGG ATATGTGGCC GCTGGTAGCC CGGCCGAGGT CTATTTCTGC 3960
TGGGCGACAC GGAATTATCG CGAATTGTAT GGGTTGCTCG CGGTCAACAG CATCTGGTTC 4020

AATCACGAAT CACCGCGTCA CGGCGAGACA TTCATGACTC GTAATCCTGC ACCATATCGC 4080

GGTCGGCAAC GAGGCGCTGA TCGATGCGCA GACGCTGATG CGCCGGCCCA CCCGGATAGG 4140

TATCAGTATT GGGGCGTTCC GGCCAGCGTA CGAGGCGTGA TCGACCGCGC AATGGGTGTT 4200

TGCGTTGAGT AATAATCTGA ACCGTGTGAA CGCATGCATG GATGGATTCC TTGCCCGTAT 4260

CCGCTCACAT GTTGATGCGC ACGCGCCAGA ATTGCGTTCA CTGTTCGATA CGATGGCGGC 4320

CGAGGCCCGA TTTGCACGCG ACTGGCTGTC CGAGGACCTC GCGCGGTTGC CTGTCGGTGC 4380

AGCATTGCTG GAAGTGGGCG GGGGGGTACT TCTGCTCAGC TGTCAACTGG CGGCGGAGGG 4440

ATTTGACATC ACCGCCATCG AGCCGACGGG TGAAGGTTTT GGCAAGTTCA GACAGCTTGG 4500

CGACATCGTG CTGGAATTGG CTGCAGCACG ACCCACCATC GCGCCATGCA AGGCGGAAGA 4560

CTTTATTTCC GAGAAGCGGT TCGACTTCGC CTTCTCGCTG AATGTGATGG AGCACATCGA 4620

CCTTCCGGAT GAGGCAGTCA GGCGGGTATC GGAAGTGCTG AAACCGGGGG CCAGTTACCA 4680

CTTCCTGTGC CCGAATTACG TATTCCCGTA CGAACCGCAT TTCAATATCC CAACATTCTT 4740

CACCAAAGAG CTGACATGCC GGGTGATGCG ACATCGCATC GAGGGCAATA CGGGCATGGA 4800

TGACCCGAAG GGAGTCTGGC GTTCGCTCAA CTGGATTACG GTTCCCAAGG TGAAACGCTT 4860

TGCGGCGAAG GATGCGACGC TGACCTTGCG CTTCCACCGT GCAATGTTGG TATGGATGCT 4920

GGAACGCGCG CTGACGGATA AGGAATTCGC TGGTCGCCGG GCACAATGGA TGGTCGCTGC 4980

TATTCGCTCG GCGGTGAAAT TGCGTGTGCA TCATCTGGCA GGCTATGTTC CCGCTACGCT 5040

GCAGCCCATC ATGGATGTGC GGCTAACGAA GAGGTAATGA CATGGCGCAA GCGACATCGG 5100

GCATTCGCGC GGCACTTTCG CAACCTGCTG TGTATGAGGC GTATCAGCGG ATTGCGGGCG 5160

CTAAAAGCGG GCTTGCGTGG ATCACAACCG ACCCCATCCA GTCGTTGCCA GGCATGCGTA 5220

CTCTCGACCT CGGTTGCTGG CCAGCGGTGA TACACAGCTC CCCGCCAGTG GACGTGACAT 5280

GTACGAGAGA CGGCATGAGC GCGGAATGTG CGACCGTGCC GTCGAGATGA CCGACGTCGG 5340

CGCTACGGCA GCCCCCACCG GACCTATCGC GCGGGGCAGC GTCGCTCGGG TCGGCGCGGC 5400

GACCGCGTTG GCCGTTGCCT GCGTCTACAC GGTCATCTAT CTGGCGGCCC GCGACCTACC 5460

CCCGGCTTGT TTTTCGATAT TCGCGGTGTT TTGGGGGGCG CTCGGCATTG CCACCGGCGC 5520

CACCCACGGC CTCCTGCAAG AAACGACCCG CGAGGTCCGC TGGGTGCGCT CCACCCAAAT 5580

AGTTGCGGGC CATCGTACCC ATCCGCTGCG GGTGGCCGGG ATGATTGGCA CCGTCGCGGC 5640
CGTCGTAATT GCGGGTAGCT CACCGCTGTG GAGCCGACAG CTATTCGTCG AGGGGCGCTG
GCTGTCCGTG GGGCTACTCA GCGTTGGGGT GGCCGGGTTC TGCGCGCAGG CGACCCTGCT 
GGGCGCGCTG GCCGGCGTCG ACCGGTGGAC ACAGTACGGG TCACTGATGG TGACCGACGC 



CTTGTGGGCC GCCACCGCGG GAGCGGTGGC GTGGCTGCTC ATGCTGATGG CCTCGCCCAC 
CGCGCGCAGC GCGGCCAGCC TGCTGACGCC CGGGGGAATC GCCACGTTCG TGCGCGGTGC 
CGCTCATTCG ATAACCGCCG CGGGTGCCAG CGCGATTCTG GTAATGGGTT TCCCAGTGTT 
GCTCAAAGTG ACCTCCGACC AGTTAGGGGC AAAGGGCGGA GCGGTCATCC TGGCTGTGAC 
CTTGACGCGT GCGCCGCTTC TGGTCCCACT GAGCGCGATG CAAGGCAACC TGATCGCGCA 
TTTCGTCGAC CGGCGCACCC AACGGCTTCG GGCGCTGATC GCACCGGCGC TGGTCGTCGG 
CGGCATCGGT GCGGTCGGGA TGTTGGCCGC AGGGCTTACC GGTCCCTGGT TGCTGCGTGT 
TGGATTCGGC CCCGACTACC AAACTGGCGG GGCGTTGCTG GCCTGGTTGA CGGCAGCGGC 
GGTAGCTATC GCCATGCTGA CGCTGACCGG CGCCGCCGCG GTCGCGGCCG CACTGCACCG 
GGCGTATTTG CTGGGCTGGG TCAGCGCGAC GGTGGCGTCG ACGCTGTTGC TGCTGCTGCC 
GATGCCGCTG GAGACGCGCA CCGTGATCGC GCTGTTGTTC GGTCCAACGG TGGGAATCGC 
CATCCATGTG GCCGCGTTGG CGCGGCGACC CGACTGATTT GTGCCCCAGG TCGACAAATC 
ACGCCGTCTC GTCAGTGAGC ACTCCGTCCT CGGGTCCGAT CCTTCCAGGA GACGTTGCAA 
CCTGATTTGG CTCAAATTGG TGCGCACCGA GGGTCGGGCA CATCGTAGGG TCGCAACAGT
CACATGTGTC ACTGCACCGG GCGACACCCG ATGTCCCGGC TCTCAGCGAC AGCTGTCTGA 
CCTGTGGTTT TGTTCCCAAG TTGGTCGTGG CTGTGCGGGA TTGGAGGTGG CGTGGGGGTC 
GCGTCGTATG GATTCTCCTC CTCGGTTCCG CGCGAAACGG CCGCAGGCGC AATGGTCACC 
AACTTGGCCG CGGTGGAGTC TAGCCTCACA TTTTCCTGGT CGCCCCCGAC AACCAGGAGG 
TCGCTGCAGA ACGGGCGTTC CCTACCCACA TCTACTATGA AGCGACAGCG GCGCCCCGCT
GTGATGGCTG AGCATGACCG ACAGAGGCGG GAAGACAGTG AAGCGAGCGC TCATCACCGG 
AATCACCGGC CAGGACGGCT CGTATCTCGC CGAACTGCTG CTGGCCAAGG GGTATGAGGT 
TCACGGGCTC ATCCGGCGCG CTTCGACGTT CAACACCTCG CGGATCGATC ACCTCTACGT 
CGACCCGCAC CAACCGGGCG CGCGGCTGTT TCTGCACTAT GGTGACCTGA TCGACGGAAC 
CCGGTTGGTG ACCCTGCTGA GCACCATCGA ACCCGACGAG GTGTACAACC TGGCGGCGCA 
GTCACACGTG CGGGTGAGCT TCGACGAACC CGTGCACACC GGTGACACCA CCGGCATGGG 



5700 

5760 

5820 

5880 

5940 

6000 

6060 

6120 

6180 

6240 

6300 

6360 

6420 

6480 

6540 

6600 

6660 

6720 

6780 

6840 

6900 

6960 

7020 

7080 

7140 

7200 

7260 

7320 

7380 
ATCCATGCGA CTGCTGGAAG CCGTTCGGCT CTCTCGGGTG CACTGCCGCT TCTATCAGGC 7440

GTCCTCGTCG GAGATGTTCG GCGCCTCGCC GCCACCGCAG AACGAGCTGA CGCCGTTCTA 7500

CCCGCGGTCA CCGTATGGCG CCGCCAAGGT CTATTCGTAC TGGGCGACCC GCAATTATCG 7560

CGAAGCGTAC GGATTGTTCG CCGTTAACGG CATCTTGTTC AATCACGAAT CACCGCGGCG 7620

CGGTGAGACG TTCGTGACCC GAAAGATCAC CAGGGCCGTG GCACGCATCA AGGCCGGTAT 7680

CCAGTCCGAG GTCTATATGG GCAATCTGGA TGCGGTCCGC GACTGGGGGT ACGCGCCCGA 7740

ATACGTCGAA GGCATGTGGC GGATGCTGCA GACCGACGAG CCCGACGACT TCGTTTTGGC 7800

GACCGGGCGC GGTTTCACCG TGCGTGAGTT CGCGCGGGCC GCGTTCGAGC ATGCCGGTTT 7860

GGACTGGCAG CAGTACGTGA AATTCGACCA ACGCTATCTG CGGCCCACCG AGGTGGATTC 7920

GCTGATCGGC GACGCGACCA AGGCTGCCGA ATTGCTGGGC TGGAGGGCTT CGGTGCACAC 7980

TGACGAGTTG GCTCGGATCA TGGTCGACGC GGACATGGCG GCGCTGGAGT GCGAAGGCAA 8040

GCCGTGGATC GACAAGCCGA TGATCGCCGG CCGGACATGA ACGCGCACAC CTCGGTCGGC 8100

CCGCTTGACC GCGCGGCCCG GGTCTACATC GCCGGGCATC GCGGCCTGGT CGGGTCCGCG 8160

CTGCTACGCA CGTTTGCGGG CGCGGGGTTC ACCAACCTGC TGGTGCGGTC ACGCGCCGAG 8220

CTTGATCTGA CGGATCGGGC CGCGACGTTC GACTTCGTTC TCGAGTCGAG GCCGCAGGTC 8280

GTCATCGACG CGGCGGCCCG GGTCGGCGGC ATCCTGGCCA ACGACACCTA CCCGGCCGAT 8340

TTCCTGTCGG AAAACCTCCA GATCCAGGTC AACCTGCTGG ATGCCGCCGT GGCGGCGCGG 8400

GTGCCGCGGC TGCTGTTCCT GGGCTCGTCG TGCATCTACC CGAAACTCGC CCCGCAGCCG 8460

ATCCCGGAGA GCGCGCTGCT CACCGGTCCG TTGGAGCCGA CCAACGACGC GTACGCGATC 8520

GCCAAAATCG CCGGCATCCT TGCGGTCCAG GCGGTGCGCC GCCAACATGG CCTGCCGTGG 8580

ATCTCGGCGA TGCCCACCAA CCTGTACGGG CCAGGCGACA ACTTTTCGCC GTCCGGCTCG 8640

CATCTGCTGC CGGCACTCAT CCGCCGCTAT GACGAGGCCA AAGCCAGTGG CGCGCCCAAC 8700

GTGACCAACT GGGGCACCGG CACGCCCCGA CGGGAGTTGC TGCACGTCGA CGACCTGGCG 8760

AGCGCATGCC TGTATCTGCT GGAACATTTC GACGGGCCGA CCCATGTCAA CGTGGGAACC 8820

GGCATCGACC ACACCATCGG CGAGATCGCC GAGATGGTCG CCTCGGCGGT AGGCTATAGC 8880

GGCGAAACCC GCTGGGATCC AAGCAAACCG GACGGAACAC CACGCAAACT GCTGGATGTT 8940

TCGGTGCTAC GGGAGGCGGG ATGGCGGCCT TCGATCGCGC TGCGCGACGG CATCGAGGCG 9000

ACGGTGGCGT GGTATCGCGA GCACGCGGGA ACGGTTCGGC AATGAGGCTG GCCCGTCGCG 9060
CTCGGAACAT CTTGCGTCGC AACGGCATCG AGGTGTCGCG CTACTTTGCC GAACTGGACT 9120
GGGAACGCAA TTTCTTGCGC CAACTGCAAT CGCATCGGGT CAGTGCCGTG CTCGATGTCG 9180
GGGCCAATTC GGGGCAGTAC GCCAGGGGTC TGCGCGGCGC GGGCTTCGCG GGCCGCATCG 9240
TCTCGTTCGA GCCGCTGCCC GGGCCCTTTG CCGTCTTGCA GCGCAGCGCC TCCACGGACC 9300
CGTTGTGGGA ATGCCGGCGC TGTGCGCTGG GCGATGTCGA TGGAACCATC TCGATCAACG 9360
TCGCCGGCAA CGAGGGCGCC AGCAGTTCCG TCTTGCCGAT GTTGAAACGA CATCAGGACG 9420
CCTTTCCACC AGCCAACTAC GTGGGCGCCC AACGGGTGCC GATACATCGA CTCGATTCCG 9480
TGGCTGCAGA CGTTCTGCGG CCCAACGATA TTGCGTTCTT GAAGATCGAC GTTCAAGGAT 9540
TCGAGAAGCA GGTGATCGCG GGTGGCGATT CAACGGTGCA CGACCGATGC GTCGGCATGC 9600
AGCTCGAGCT GTCTTTCCAG CCGTTGTACG AGGGTGGCAT GCTCATCCGC GAGGCGCTCG 9660
ATCTCGTGGA TTCGTTGGGC TTTACGCTCT CGGGATTGCA ACCCGGTTTC ACCGACCCCC 9720
GCAACGGTCG AATGCTGCAG GCCGATGGCA TCTTCTTCCG GGGCAGCGAT TGACGCGCCG 9780
GCGCGTCAAT CTATTTCGAC ATTCGCGTGA AGACGTTTTC CCAGAATCGA CTGTTGTAGG 9840
CGTAGAACTC CCGGCCGCGT AGGTAGGCAT GTGATATTCG CCTTCCCCCG AACGGGTAGC 9900
GGCGATGAAG GTCGCCCATG CGGCGCAGAT CACCGAAGAC CGCGCTTGGT TCCCGGTGCG 9960
AGCCGACGCC CGTGGTGTCG AACTCGCACA GCACACACCG AATCGTGACC GGCTCGCATA 10020
CCAGCGCGGC CCGCAATATG AATTCCTGGT CGGCGGCGAT CCCGAAATCA AGGTCGTAGC 10080
CACCGATCTT GGCCACCAGC GATGATCCGA AGAACGATGC TTGATGCGGA ACAACCTGCT 10140
TGCCGGCCAG GAATTTGCGC AGGCTGAAAG GTATCGGGCC GCGCACCCGA TCGAGCCCGA 10200
CGAGACGATC CATCCCGAAG CCCCACAATT CGGACACCGG TCCCTTGCCG GATAGCGCCT 10260
CCACGGCCTG GGCTACCACG TCGGGCCCGG AAAAACGATC GGCGGAGTGC AAGAACCACA 10320
ACAGATCACC CGATGCGTGC GCGATGCCCT GGTTCATCGC GTCGTACCGC CCGCCGTCGG 10380
GCTCGGACTG CCAATACGCG AAGCCTGGTT CACACCCGGA CAGGTATGCC ACCACGTCGT 10440
CGCCGCTGCC ACCGTCGATT ACGATGTGCT CGATGCGTCC CCGGTAGCGT TGCGCCCGCA 10500
CACTTTTCAC CGTGCGCTGC AACCCGTCGA GGTCGTTGAA CGAGATCGTT ATCACCGAGA 10560
CGGTCGGAGC AGACGTCACC GAGTTCCCCT AGGTTGCTGG CGGCGATTGT GGATCACCGG 10620
GTCTTGATAC CGATGAAGGT GCCTCGAAGA TTCGCCGCAT AGGAACCTCC GAGCAACGAC 10680
TCGGCGATGC TTGGTTCCAA GTTGTCGTAC TCCTCCATCA CCAGGTCGAC GCCGACGTCT 10740
TTGATGGCCT GAAGTAGGTG CTCGCGTTGA ATCCAGAATG ACCGGCGATT GTCCCAGGAC 10800
GCCCATTTTG CGGTGTCGCG CTGGCCAAAC GAGCGGTCGT CGGAAAACTC GGTAAACCAC 10860
CTACCGGGAA GTCCCTCATG TTCGGTGGGC GCCGAGAGCA TGAACTTCAC CGGCGCCGGC 10920
CGCCGCAGCA ACCGATCGGT CAATTGTCGT GCCGTCGTGG GCAACCGGAG CCATTTATCG 10980
CTCCGGTTGA TGATCGAGAA GTGCGTCTGG AGAATCAGCA GCTTGTTCGT TACCGACGAG 11040
AGGGTTTCCA GGTATTGCTT CGGATTCTCC AGGTGGTAGA AGAGGCCGCA GCAGAAGACG 11100
GTATCGAAGA GCCCGTGGTT GGCGATGTTG AGGGCGTTGT CGTGGACGAA CCGGAGATTC 11160
GGCAGGTTGG TCTTCGATTT GATGTAGTTG CAGGCCGCCA TGTTCAGCTC GCGAACCTCG 11220
ATCCCGAGGA CCTGAAATCC CATGCGCGCG AACCCGACCG CGTACCCGCC TTCCAAGCAG 11280
CCGACATCGG CCAGGCGTAG GTGGCTCTTG TCCCCGGGAA AGACGGTTTC CAGAATCCCG 11340
CGCGCCGAGA TGAACCAGGA CGATTCGTCT AACGTGCGCG AGGACTCCGG TATCGTCAAG 11400
GTTCCGTCGT CGAGGCGAAC GTTGTGGGCG GTGAATTGTA CCGCGCCGGC CGAATGTTCC 11460
TGTGCCATCA CTTGGTTAGC CCCTTCGGCT GGTCCTGGGT TTGTCGACAT GGTCAGGCTC 11520
GACAGCCGCG TCGGAGCCGG GAGGGCCACA CATCCACGAG CCCCCTGCGG CTCGGCGTCG 11580
CGGCGGCGAG CTTGCGCCAC TGGGTCTTGA GCCGCCGCGC GGGTGTCGCC CCGCGGTGCT 11640 
GCAGCGCCAG CATGGCGATC CGGGGATGGC GCGCGATGGT TTCCTGCAGC GCGGCGCGCC 11700 
CCTCCGGGCC TGGAACGTTG GCGATCTGGC GAAGGATCCA GTCGGCCATG ACGGCGATGA 11760 
GCTCCTCGCG CGCGGGGTCT CCCGGGAACA GGTCGAGCAT CGCGTCAAAC GTCGCCGCAT 11820 
GCCCCGGACC CTGCGTCAAC CAGAACTTTG GCGGGTCCAC CACCTGGTTG TGCCACATGC 11880 
CTTGGGCGTG GCGGCGATAC ACGGCCATGG TGTCGGGCAA CATGGCGATC TCGCCATGCA 11940 
CCGCGTGCCG GACGTGCAGA TACCAGTCCA GGGGCATGAC GTCGGCAGGA ATGTCGTCGT 12000 
AGCGCTCGAG GCGACGGTAC ACGGCCGAGT TGGTCTGGAT GAAGTTCATC AAGATCAACG 12060 
CATCCAGGCT CAAGTTGCCC CGCACCCGAA CCGGGGGGAA CTTCGAGTCC TTGGCATGGC 12120 
CGTCCTCCCA TATCACTCGG ACGGGATGGA AGCACACCGT CGTCTTGGGG TGCCGGTCGA 12180 
GGAATGCGAC CTGTTTGCTT AGCTTCAGCG GATCGATCCA GTAGTCGTCC GCCTCGCACA 12240 
ACGCGACGTA CTCGCCGCGA GCGGCCGACA GGGCGCCGGT CAGGTTCCCA TTGAGGCCGA 12300
GGTTTTCGGT CCTGAAGATC GGCCGGAACA CGTGCGGGTA CCGCTCGGCG TACTCACGGA 12360
TGATCGCCGG GGTGGCATCG GTCGACGCGT CGTCGGCGAC GATGATCTCC ACCGGGAAGT 12420
CGGTTTGCTG GTCGAGAAAG CTGTCGAAGG CCTGACGGGC GTAGCCCGCC TGGTTGTGAG 12480
TGGTCGAGAC GATGCTCACC TTGGGGCAAA GCTGGGGACT CACCGTCGGC CCTTTTCCTG 12540

CGCGGCCGCA AGGGTATTGC GATGGCGAAC GTGAATCGCC TGTGCCCGCC GGCCGTCGGC 12600

CGTCGTGGCC TGGTGGTCGG CGGACGTACG GCACACGCTG GCGAAGTATA GCGAGGGTGC 12660

ACTGACGTTG GGCTCGAACC GCGTGGCGCG CGGTGTGGGC GCACCGTCTC GAGTCGGTGC 12720

TGGTTGGCTC GC 12732 
(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 289 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
ATACTCAAGC TTGCCGCAAT CGAAACCAAC CTGTTTGTGC CGCAAGAAAT TACGCCGTGG 60
CCCGGCGCCG ATCAAGAAAC GCCCCGGCGC GCGGCGGTGT CGTCGTATGG CATGACGGGC 120

ACCAATGTGC ACGCCATTGT CGAGCAGGCA CCGGTGCCAG CCCCCGAATC CGGTGCACCA 180 
GGCGACACCC CGGCCACACC CGGTATCGAC GGCGCGCTGC TGTTCGCGCT GTCGGCCAGC 240 
TCGCAGGACG CGCTGCGGCA AACCGCCGCG CGGCTGGCCG ATTGGGTCT 289 
(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 278 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic) 
(iii) HYPOTHETICAL: NO 
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:
TTGGCGGGTT GGCCACACAC CCGCCGGTGA CGGCGACGAT GCTGGGCTGG TTGCGGCCCT 60

GCGCCACCGC GGCTTGCATG CTGGTTGGCT GTCTTGGGAC GATCCCGAAA TAGTCCACGC 120 
GGATCTGGTG ATTTTGCGGG CTACCCGCGA TTACCCCGCG CGGCTCGACG AGTTTTTGGC 180 
CTGGACTACC CGCGTGGCCA ATCTGCTGAA CTCGCGGCCG GTGGTGGCCT GGAATGTCCA 240

CGCCGTTCAC CTACGTGACC TTGATGGGAT CCGGGGGT 278 
(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1280 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

CCGACCCAGA CACTGACCGG GCGACCGCTG ATCGGCAACG GCACCCCCGG GGCGGTCGGC 60

AGCGGGGCCA CCGGGGCCCC CGGTGGGTGG CTGCTCGGCG ACGGCGGGGC CGGCGGGTCC 120

GGCGCGGCGG GCTCGGGCGC GCCCGGCGGG GCGGGCGGGG CTGCCGGGCT GTGGGGTACC 180

GGCGGGGCCG GCGGGATCGG CGGAGCCAGC ACCGTACTCG GGCGGC AC CGG CGGGGGAGGC 240

GGGGTCGGTG GGCTGTGGGG CGCCGGTGGG GCCGGCGGGG CCGGTGGAAC CGGCCTTGTT 300

GGTGGCGACG GCGGGGCCGG TGGGGCCGGC GGGACCGGCG GACTGCTGGC CGGGCTGATC 360

GGTGCCGGCG GAGGTCACGG CGGGACCGGC GGGCTCAGCA CTAATGGCGA CGGCGGGGTT 420

GGCGGGGCCG GCGGGAATGC CGGAATGCTC GCCGGGCCGG GCGGCGCCGG CGGAGCCGGC 480

GGTGACGGCG AAAACCTGGA CACCGGTGGG GACGGCGGGG CCGGCGGTAG CGCAGGGCTG 540

CTGTTCGGCA GCGGCGGCGC CGGCGGCGCC GGCGGATTTG GTTTCCTCGG TGGGGACGGC 600

GGGGCCGGTG GCAACGCCGG GCTGCTGTTG TCCAGCGGCG GGGCCGGCGG GTTCGGCGGG 660

TTCGGCACCG CCGGTGGGGT CGGTGGGGCC GGCGGCAATG CCGGCTGGCT GGGCTTCGGC 720

GGGGCCGGGG GCATCGGCGG AATCGGCGGT AACGCTAACG GGGGCGCCGG TGGGAACGGC 780

GGCACCGGCG GTCAGTTATG GGGTAGCGGC GGCGCCGGCG TCGAAGGCGG CGCAGCCTTA 840

AGCGTCGGCG ACACCGGCGG GGCCGGTGGC GTCGGCGGCA GCGCCGGGCT GATCGGCACC 900

GGCGGCAACG GCGGCAACGG CGGCACCGGC GCCAACGCCG GCAGCCCCGG AACCGGCGGC 960

GCCGGCGGGT TGCTGCTGGG CCAAAACGGG CTCAACGGGT TGCCGTAGCC GGGCGGCACG 1020

GCATGGCTTC CGGGCGTCAA CCACTCGCCG GTGATGCAGA TCGGCTGCGG AGCGGGCCGC 1080
CAAAATGGGG GCCGCCGCGC CAGGTATCTC GGCGAAGATC CCCGGCGCTC GAGCGCTTTG 1140
TCAGAGGCCC GTCGCGGGTC GTCGTGACGA CGGCTATCCG GGCGGTGCGG GTTTCGCGGC 1200
GCGCCCTGTG CCCGGCACCG CCGCCCGTTT GTCGGCAACG CCGCCGCGAC CCGTGAGCCG 1260
TCCAGCAGCT GGCGCCTGCG 1280
(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 127 base pairs

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:
GGGCATCGGC GGAATCGGCG GTAACGCTAA CGGGGGCGCC GGTGGGAACG GCGGCACCGG 60
CGGTCAGTTA TGGGGTAGCG GCGGCGCCGG CGTCGAAGGC GGCGCAGCCT TAAGCGTCGG 120
CGACACC 127